Below is the general workflow for replicating our analyses of trait evolution in Tiliquini skinks. While we provide the scripts to redo everything from scratch, we also include Rdata files that will give you the output from model fits, objects, etc., to speed up the process.
Load necessary packages
Load necessary packages
library(ape)
library(phytools)
library(ggplot2)
library(patchwork)
library(dplyr)
Let’s save some time and start by loading our data.
load("Egernia_Data.RData", verbose=T)
## Loading objects:
## all.modules
## egernia.tree
## etree
## egernia.tree.alt
## lessLSR
## smallLSR
plot(egernia.tree, cex=0.3); axisPhylo()
We’ll also quickly put all our traits into a single data frame for use later.
all.traits <- bind_cols(all.modules$head, all.modules$body, all.modules$tail, all.modules$limb,
all.modules$size, all.modules$neck, all.modules$eye)
And load some functions we’ll want later
source("../Scripts/trait.at.time.R")
Now let’s think about how different modules are integrated.
We will create some models to test the modularity hypotheses.
We’ll use the R package EMMLi to identify morphological modules
of the lizard body.
library(EMMLi)
We’ve designed some basic models to test.
Remember: no module can be made up of only a single
trait! If it is, code it as NA
model.small <- read.csv("Morph_Module_Models.csv", header=T)
res <- EMMLi(corr = corr.small,
N_sample = 56,
mod = model.small,
abs = T,
pprob = 0.05);
res
Based on our preferred module model we have saved our traits into the
all.modules object.
The preferred model corresponds to 4 modules (head, body, tail, limb).
Three traits (neck, eye, pelvis) are unintegrated and do not fit among
the other modules. In the process of size-correcting our traits we also
generate a new trait size, so we’ll keep that.
names(all.modules)
## [1] "head" "body" "tail" "limb" "size" "neck" "eye" "pelvis"
head(all.modules$head)
## Head_Width Snout_Eye Head_Depth Pos_Skull
## Tribolonotus_ponceleti 1.1881587 0.3631190 0.8845977 0.7614419
## Tribolonotus_pseudoponceleti 1.0713812 0.3390085 0.7727104 0.7987440
## Tribolonotus_blanchardi 0.8335979 0.3098250 0.6311908 0.6640375
## Tribolonotus_gracilis 1.1688530 0.3428757 0.9267157 0.7107731
## Corucia_zebrata 1.0718004 0.3011819 0.8046630 0.6062986
## Lissolepis_luctuosa 0.8110920 0.4167810 0.7283571 0.6141436
We might want to look at fitting multi-OU models to our modules. Let’s fit l1ou to all the modules and save the output.
l1ou.modules <- list(); l1ou.trees <- list()
#for (y in 1:length(all.modules)){
for (y in 1:4){
# Align the trait data and tree
trait.data <- adjust_data(egernia.tree, all.modules[[y]])
#### Estimate the number and position of shifts a priori
fit <- estimate_shift_configuration(trait.data$tree, trait.data$Y,
nCores=8, quietly=F, criterion="AIC")
# if you want to do only a single trait, 'data$Y[,x]'
# test for convergent regimes if you'd like (this is slow, and not really necessary)
#shift.fit <- estimate_convergent_regimes(fit, nCores=8, criterion="BIC")
# save the results to a list
#l1ou.modules[[y]] <- shift.fit
l1ou.modules[[y]] <- fit
# save the simmap trees to a list
#l1ou.trees[[y]] <- shifts.to.simmap.l1ou(shift.fit)
l1ou.trees[[y]] <- shifts.to.simmap.l1ou(fit)
#plot(l1ou.trees[[y]])
}
names(l1ou.modules) <- names(all.modules)[1:4]
names(l1ou.trees) <- names(all.modules)[1:4]
If you ran the above, save these results externally
save(l1ou.modules, l1ou.trees, file="l1ou_MorphModules.RData")
If you don’t have the time to run these, load them instead
load("l1ou_MorphModules.RData", verbose=T)
## Loading objects:
## l1ou.modules
## l1ou.trees
Plot all the trees with shifts
source("../Scripts/shifts.to.simmap.l1ou.R")
par(mfrow=c(3,3))
for(j in 1:length(l1ou.trees)){
plotColorSimmap(l1ou.trees[[j]]); axisPhylo(); title(paste(names(l1ou.trees)[[j]], (ncol(l1ou.trees[[j]]$mapped.edge)-1), "shifts"))
}
## no colors provided. using the following legend:
## 0 1 2 3 4 5 6 7
## "black" "#d3617d" "#47d1b3" "#c8c0f7" "#76f7ec" "#c964ea" "#bd07f9" "#ead185"
## 8 9
## "#aa8fe0" "#87e86a"
## no colors provided. using the following legend:
## 0 1 2 3 4 5 6
## "black" "#fce8a6" "#57dbab" "#e27a7f" "#9a96f7" "#f7c69e" "#d36eea"
## no colors provided. using the following legend:
## 0 1 2 3 4 5 6
## "black" "#50a303" "#e58632" "#f9d7bd" "#d1830e" "#f6ffa5" "#12ed7c"
## no colors provided. using the following legend:
## 0 1 2 3 4 5 6 7
## "black" "#ffb47f" "#21fc1e" "#d067ea" "#37e87b" "#f977b8" "#70d660" "#fc2d8e"
## 8
## "#d1304a"
We’ll start by fitting a standard set of comparative models (BM, OU,
EB) and Levy models in pulsR.
Code for running those models across all traits individually is
contained in RUN_pulsR.R.
We can use BayesTraits to estimate branch specific rates of trait evolution. This method uses a variable-rates Brownian Motion model, and is run externally to R. We can read in our results though. The next step is to run BayesTraits across all the traits individually. Like any Bayesian method, multiple independent chains should be run and compared to assess convergence.
Begin by saving each trait to a BayesTraits formatted file (following SaveTraitsToFiles_BayesTraits.R)
for(j in 1:length(all.modules)){
curr.mod <- all.modules[[j]]
for (k in 1:length(curr.mod)){
trait <- select(curr.mod, k)
tname <- colnames(trait)
filename <- paste0("~/Desktop/", tname, ".txt")
write.table(trait, file=filename, sep=" ", quote=F, col.names=F)
}
}
Format your BayesTraits control file.
7
2
VarRates
Burnin 10000000
Iterations 110000000
Run
Once you’ve run all those analyses and confirmed convergence/stationarity, the next step will be to process the BayesTraits outputs. This can be slightly complicated without a helper file. Use the process_PPP function in plotting_BayesTraits.R to process the outputs. You can do this following the RUN_BayesTraits_VarRates_Likelihoods.R script. The same script will also allow you to extract comparable AIC values for model comparison.
Now that we’ve fit all the models, get the model fits together so we can compare them.
load("pulsR_Results.Rdata", verbose=T)
## Loading objects:
## trait.fit
## params.list
## trait.aicw.all
## trait.fit.types
## trait.aicw.all
## trait.aicw.types
## aicw.df
## aicw.all.df
Since we have the objects already, plot them here.
# plot the 'type' results (lumps levy models together)
aicw.types.plot <- ggplot(aicw.df, aes(fill=model, y=aicw, x=trait)) +
geom_bar(position="stack", stat="identity") + theme_classic() +
scale_fill_brewer(palette="RdYlBu") +
theme(axis.text.x = element_text(angle = 45, hjust=1), legend.position="bottom")
# plot all the AICw (plots all levy models separately)
aicw.all.plot <- ggplot(aicw.all.df, aes(fill=model, y=aicw, x=trait)) +
geom_bar(position="stack", stat="identity") + theme_classic() +
scale_fill_brewer(palette="RdYlBu") +
theme(axis.text.x = element_text(angle = 45, hjust=1), legend.position="bottom")
# plot the two together
aicw.types.plot / aicw.all.plot
We will want to be able to visualize the results of our VR models.
load("BayesTraits_processed_traits.RData", verbose=T)
## Loading objects:
## all.BT
names(all.BT)
## [1] "all.trees" "all.sig" "all.res"
## [4] "scalar.trees" "mean.scalar.trees" "rate.trees"
Here’s an explanation of what each of those elements is:
1. all.trees holds a tree for each trait. The tree is
your input tree, rescaled by the mean estimated sigma (rate) value per
branch (Mean.SigV in BayesTraits terms).
2. all.sig holds the mean sigma value per branch in a
vector (?), this is the value each branch above was scaled by.
3. all.res holds a data frame for each trait. The data
frame includes the node and branch information, rate info, etc.
4. scalar.trees holds a tree for each trait. The tree
is your input tree, rescaled by the median scalar r (Median.Scalar in
BayesTraits terms).
5. mean.scalar.trees holds a tree for each trait. The
tree is your input tree, rescaled by the mean scalar r (Mean.Scalar in
BayesTraits terms).
6. rate.trees holds a tree for each trait. The tree is
your input tree, with each branch rescaled to the mean estimated sigma
value (Mean.SigV).
To visualize the rates and shifts across the tree use the plot.VarRates.tree function.
source("../Scripts/plotting_BayesTraits.R")
## Loading required package: RColorBrewer
## Loading required package: viridisLite
plot.VarRates.tree(BT = all.BT$all.res$BodyWidth, # res object for focal trait
phy = all.BT$scalar.trees$BodyWidth, # tree for focal trait
col.palette = "YlGnBu", # choose your color palette
legend = T, # want a rate legend included?
tree.type = "phylogram", # tree shape
log.rates = T, # should rates be logged (probably)
trait = "Body Width", # name of the trait
outline = F, # black outline on branches for light colors
pos.selection = T, # indicate positive selection (r >= 10)
shift.rates = T) # indicate shifted rates (r >= 2)
We can also apply this across all the traits simultaneously.
# plot all the trees with edges colored by rates
par(mfrow=c(2,2))
# plot the scalar transformed trees
for(j in 1:length(all.BT$all.trees)){plot.VarRates.tree(BT=all.BT$all.res[[j]], phy=all.BT$scalar.trees[[j]], col.palette="YlOrRd", legend=F, tree.type="phylogram", log.rates=T, trait=names(all.BT$all.res)[[j]], outline=F, pos.selection=T, shift.rates=T)}
Start by reading in the Median Scalar Trees (all.BT$scalar.trees) from file if you’d like
all.trees <- read.nexus(file="../Trees/BayesTraits_VarRates_Egerniinae_MorphTraits_Scalar.trees")
plot(all.trees$Body_Width, show.tip.label = F); axisPhylo()
# sort the order of the trees to match the columns
all.trees <- all.trees[colnames(all.traits)]
We’ll extract the ancestral trait values for each trait
anc.traits <- NULL; anc.all.traits <- NULL
for (k in 1:length(all.trees)){
base.trait <- all.traits[,k]; names(base.trait) <- rownames(all.traits)
anc.traits[[k]] <- fastAnc(all.trees[[k]], base.trait)
anc.all.traits[[k]] <- c(base.trait, anc.traits[[k]])
#phenogram(egernia.tree, x=c(base.trait, anc.trait), fsize=0.5)
}
names(anc.traits) <- names(all.trees)
names(anc.all.traits) <- names(all.trees)
Or, we can just read in all these bits from file.
load("Tiliquini_Ancestral_MorphTraits.Rdata", verbose=T)
## Loading objects:
## all.traits
## anc.traits
## anc.all.traits
## all.trees
all.traits is a data frame of the traits for all
living taxa.
anc.traits is a named list of vectors of ancestral
traits.
anc.all.traits is a named list of vectors of traits of
living taxa and ancestors.
all.trees is a named multiPhylo object of Median Scalar
trees.
Anyway, let’s move on and get the trait value along branches at given time intervals (here 0.25 million years). This will allow us to plot the traits through time.
all.anc <- lapply(1:length(anc.all.traits), function(x) trait.at.time(timeslices=0.25, phy=egernia.tree, trait.vector=anc.all.traits[[x]], plot=F))
names(all.anc) <- names(anc.all.traits)
# have a look at what the data look like
head(all.anc$Head_Width)
## time trait
## 58 0.00 0.9764456
## 61 0.00 0.9764456
## 581 0.25 0.9769712
## 611 0.25 0.9761035
## 582 0.50 0.9774968
## 612 0.50 0.9757614
Next we can extract the disparity across all living branches at given time intervals (same as the rate above). I call this Disparity-at-Time.
emp.dis <- lapply(1:length(all.anc), function(x) extract.variance(all.anc[[x]], plot = FALSE, metric = "disparity"))
names(emp.dis) <- names(all.anc)
emp.dis <- lapply(1:length(emp.dis), function(x) emp.dis[[x]] <- cbind(emp.dis[[x]], trait=names(emp.dis)[[x]]))
names(emp.dis) <- names(all.anc)
emp.dis.df <- NULL
for(k in 1:length(emp.dis)){emp.dis.df <- rbind(emp.dis.df, emp.dis[[k]])}
Show a single disparity plot
plot(emp.dis$Head_Width$measure ~ emp.dis$Head_Width$time,
xlab="Time", ylab=paste("Total", "Disparity"), type="l", main="Head Width")
#abline(v=30.06, col="#ABDDA4", lty=3, lwd=3)
plot(emp.dis$Head_Width$measure.rich ~ emp.dis$Head_Width$time,
xlab="Time", ylab=paste("Relative", "Disparity"), type="l", main="Head Width")
layout(matrix(1))
Now that we have the disparity values through time we can plot them for each trait.
ggplot(emp.dis.df) +
geom_line(aes(x=time, y=measure, color=trait)) +
facet_wrap(~trait, scales="free") +
theme_classic() + theme(legend.position="none")
Ok let’s apply the same principles to the rates. I’ll call this Rate-at-Time
# annoying, but to get the right order we have to rename the rate objects
sig.BT <- all.BT$all.sig
# reorder the objects
for(k in 1:length(sig.BT)){names(sig.BT[[k]])[1:56] <- egernia.tree$tip.label}
for(k in 1:length(sig.BT)){sig.BT[[k]] <- append(sig.BT[[k]], 0, after=Ntip(egernia.tree))}
for(k in 1:length(sig.BT)){names(sig.BT[[k]])[[57]] <- 57}
# actually extract the rates-at-time
all.RaT <- lapply(1:length(sig.BT), function(x) rate.at.time(timeslices=0.25, phy=egernia.tree, rate.vector=sig.BT[[x]]))
names(all.RaT) <- names(sig.BT)
# extract the 95% quantile around the estimated rate
all.rates <- lapply(1:length(all.RaT), function(x) extract.stat(all.RaT[[x]], stat="mean", plot = F, range="quantile"))
names(all.rates) <- names(all.RaT)
# combine the mean and quantiles into a single data frame
all.rates <- lapply(1:length(all.rates), function(x) all.rates[[x]] <- cbind(all.rates[[x]], trait=names(all.rates)[[x]]))
names(all.rates) <- names(all.RaT)
# reshape into a new dataframe
all.rates.df <- NULL
for(k in 1:length(all.rates)){all.rates.df <- rbind(all.rates.df, all.rates[[k]])}
Now that we have the rates through time for each trait, plot it.
ggplot(all.rates.df) +
geom_ribbon(aes(x=time, ymin=`5%`, ymax=`95%`, fill=trait, alpha=0.5)) +
geom_line(aes(x=time, y=rate, color=trait)) +
facet_wrap(~trait, scales="free") +
theme_classic() + theme(legend.position="none")
Here we can compare the rates through time for several traits. As an example I show the rates for the head module traits. This follows the right panel of Figure 3.
library(ggbreak)
all.rates.df$`5%`[which(all.rates.df$`5%`<=0)] <- 0
head.rates.df <- filter(all.rates.df, trait %in% c("HeadDepth","HeadWidth","PosSkull","EyeDiameter"))
p1 <- ggplot(head.rates.df) +
geom_ribbon(aes(x=rev(time), ymin=`5%`, ymax=`95%`, fill=trait), alpha=0.1) +
geom_line(aes(x=rev(time), y=rate, color=trait)) +
scale_color_brewer(palette="Spectral") +
scale_fill_brewer(palette="Spectral") +
scale_x_reverse() +
#scale_y_log() +
xlab("Time (Ma)") + ylab("Evolutionary Rate") +
theme_classic() + theme(legend.position="bottom")
p1
Up until now we’ve been looking at rates and disparity of individual traits. We’ll shift now to looking at trends in modules (head, body, tail, limbs).
source("../Scripts/rate.trajectory.R")
## Loading required package: tibble
load("BayesTraits_processed_modules.RData", verbose=T)
## Loading objects:
## all.BT
names(all.BT)
## [1] "all.trees" "all.sig" "all.res"
## [4] "scalar.trees" "mean.scalar.trees" "rate.trees"
# Rate Trajectory and Rate-to-Node for HEAD
HEAD.anc.module <- data.frame(Head_Width = anc.all.traits$Head_Width,
Snout_Eye = anc.all.traits$Snout_Eye,
Head_Depth = anc.all.traits$Head_Depth,
Pos_Skull = anc.all.traits$Pos_Skull)
HEAD.plot <- rate.trajectory.BT.module(tree = egernia.tree, module = HEAD.anc.module,
PPP.all.res = all.BT$all.res$Head,
tip.spread = c("Tiliqua_rugosa","Tiliqua_scincoides"),
focus = "clade", psize=3, lsize=2, background.color="grey",
gimme.the.data=F, inset=T)
We’ve run each of these modules through rate.trajectory in the script PLOT_rate.trajectory.R.
load("rate.trajectories.RData", verbose=T)
## Loading objects:
## rate.list
load("BayesTraits_ancestral_modules.RData", verbose=T)
## Loading objects:
## anc.modules
names(rate.list); names(anc.modules)
## [1] "head" "body" "tail" "limb"
## [1] "head" "body" "tail" "limb"
The rate.list list is made up of processed data frames
which hold information about the tree, BayesTraits results (rate/scalar
estimates, shift frequency in posterior, etc.), color scheme.
The anc.modules list is made up of data frames holding
the observed and estimated trait values for all tips and internal nodes
of the tree.
Using the rate.trajectory.BT.module function we can plot a traitgram/phenogram style with branches colored by evolutionary rate for any module and any clade (set of tips) or single tip of the tree. I’ll show you how.
Here’s a traitgram for the limb module highlighting the trajectory of the Tiliqua/Cyclodomorphus clade. You will notice the dark color on the stem branch of this clade suggesting a shift in evolutionary rate.
rate.trajectory.BT.module(tree = egernia.tree, module = anc.modules$limb,
PPP.all.res = rate.list$limb,
tip.spread = c("Tiliqua_rugosa","Cyclodomorphus_maximus"),
focus = "clade", psize=3, lsize=2, background.color="grey",
gimme.the.data=F, inset=T)
We can limit this to just a single tip, say Tiliqua rugosa like this:
rate.trajectory.BT.module(tree = egernia.tree, module = anc.modules$limb,
PPP.all.res = rate.list$limb,
tip.spread = c("Tiliqua_rugosa"),
focus = "tip", psize=3, lsize=2, background.color="grey",
gimme.the.data=F, inset=T)
We can also pick an entirely different clade, one that shows no sign of rate shifts. You’ll notice how consistent the rates are.
rate.trajectory.BT.module(tree = egernia.tree, module = anc.modules$limb,
PPP.all.res = rate.list$limb,
tip.spread = c("Liopholis_whitii", "Liopholis_personata"),
focus = "clade", psize=3, lsize=2, background.color="grey",
gimme.the.data=F, inset=T)
It’s great that we can show the rates along individual branches, but we might want to look at the rates of branches with inferred rate shifts against the background rate.
Since we already have the processed BayesTraits output files (in rate.list), we can just plug them into the rate.to.node.BT function.
Head Module
rate.to.node.BT(phy=egernia.tree, PPP.obj=rate.list$head, psize=3, lsize=2, log.rates=T, col.palette="YlOrRd")
Body Module
rate.to.node.BT(phy=egernia.tree, PPP.obj=rate.list$body, psize=3, lsize=2, log.rates=T, col.palette="YlOrRd")
Tail Module
rate.to.node.BT(phy=egernia.tree, PPP.obj=rate.list$tail, psize=3, lsize=2, log.rates=T, col.palette="YlOrRd")
Last we’ll do the morphotrajectory and sim-to-node plots.
source("../Scripts/morphotrajectory.R")
This will allow us to visualize the expansion of morphospace by comparing our empirical observations (and estimated ancestral values) against a set of Brownian Motion simulations.
We can start by looking at a remarkable example of Egernia stokesii tail proportions. Here we can see that the combination of tail traits for this species fall outside what we would expect from BM.
sim.to.node(phy=egernia.tree, VRphy=all.BT$scalar.trees$Tail,
trait=all.modules$tail[,c("Tail_Length","Tail_Width")],
tip="Egernia_zellingi", sim.num=500)
In comparison we can plot what the evolution of tail traits looks like for something quite standard, like Liopholis whitii. Here we see that the tail traits for this lizard are quite conservative, compared to the eovlutionary possibility under Brownian Motion.
sim.to.node(phy=egernia.tree, VRphy=all.BT$scalar.trees$Tail,
trait=all.modules$tail[,c("Tail_Length","Tail_Width")],
tip="Liopholis_whitii", sim.num=500)
# RUN_BayesTraits_VarRates_Output.R